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A Time- Symmetric Block Time-Step 
Algorithm for N-Body Simulations 

Junichiro Makino \ Piet Hut^, Murat Kaplan^, Hasan Saygm 

Abstract 



The method of choice for integrating the equations of motion of the general N- 
body problem has been to use an individual time step scheme. For the sake of 
efficiency, block time steps have been the most popular, where all time step sizes are 
smaller than a maximum time step size by an integer power of two. We present the 

Cn ■ first successful attempt to construct a time-symmetric integration scheme, based on 

block time steps. We demonstrate how our scheme shows a vastly better long-time 

t~^ , behavior of energy errors, in the form of a random walk rather than a linear drift. 

£_2 ' Increasing the number of particles makes the improvement even more pronounced. 

o 

^ . 
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.. ' 1 Introduction 



For long-term N-body simulations, it is essential that the drift in the values 
of conserved quantities is kept to a minimum. The total energy is often used 
as an indicator of such a drift. During the last fifteen years, two approaches 
have been put forward to improve numerical conservation of energy and other 
theoretically conserved quantities: symplectic integration schemes, where the 
simulated system is guaranteed to follow a slightly perturbed Hamiltonian 
system, and time-symmetric integration schemes, where the simulated system 
follows the same trajectory in phase space, when run backward or forward 
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In both cases, for syinplectic as well as for time-symmetric schemes, the intro- 
duction of adaptive time steps tends to destroy the desired properties. Sym- 
plectic schemes are perturbed to different Hamiltonians at different choices of 
time step length, and therefore lose their global symplecticity. Time-symmetric 
schemes typically determine their time step length at the beginning of a step, 
which implies tha t running a step backward gives a slightly different length 
for that step. See iLeimkuhler fc ReichI ( 20051 ) for a recent review of various 



attempts to remedy this situation. 

In practice, many large-scale simulations in stellar dynamics use a block time- 
step approach, where the on ly allowed values for the time step length are 
powers of two ( Aarsethl . l2003h . The name derives from the fact that, with this 



recipe, many particles will share the same step size, which implies that their 
orbit integration can be performed in parallel. An added benefit, in the case 
of individual time steps, is the fact that block time steps allow one to predict 
the positions of all particles only once per block time step, rather than sepa- 
rately for each particle that needs to be moved forward. Since parallelization is 
rapidly becoming essential for any major simulation, we explore in this paper 
the possibility to extend time symmetry to the use of block time steps. 

In Section 2, we analyze some of the problems that occur when applying 
existing methods to the case of block time steps, and we offer a novel solution, 
with a truly time symmetric choice of time step, with the restriction that we 
only allow changes of a factor two in the direction of increasing and decreasing 
the time step. In Section 3, we present numerical tests of our new scheme in 
the simplest case of the 2-body problem, which already shows the superiority 
of our approach over various alternatives. In Section 4, we demonstrate that 
the advantage carries over to the general N-body problem. Section 5 sums up. 



2 Block Time Steps 

2. 1 Implicit Iterative Time Sym,metrization 



It is surprisingly easy to introduce a time-symmetric version for any adaptive 
self-starting integration scheme. Let ^ = (r, v) be the 2A^-dimensional phase 
space vector for a system with A^ degrees of freedom, and let /(^j, 5ti) be the 
operator that maps the phase space vector of the system at time ti to a new 
phase space vector at time tj+i = tj+5tj. Any choice of self-starting integration 
scheme, together with a recipe to determine the next time step (5tj, at time t, 
and phase space value ^i(ti), defines the precise form of /(,^j, 5ti). 



The recipe for making any such scheme time-symmetric was given bv lHut. Makino fc McMillan 



f|l99.5h . 



as: 



6+1 = f{^i,Sti), 



5t 



where h{^) can be any time step criterion. Note that this recipe leads to an 
imphcit integration scheme, which can be solved most easily through iteration. 
In practice, one or two iterations suffice to get excellent accuracy, but at 
the cost of doubling or tripling the number of force calculations that need 
to be perforr aed. Extensions of t his i mplicit symmetri zation idea have been 
presented bv JFunato et al.1 (jl99fit ) and JHut et all (jl997n . 



Since we will need to inspect the idea of iteration below in more detail, let us 
write out the process here explicitly. We start with the given state C,i and the 
implicit equation for ^j_|_i of the form 



Ci+i = f{^i,Sti{^i,^i+i)) 



The ffist guess for ^j+i is 



r(0) 



c7i = /te,5t,te,e,)) 



(3) 



and we can consider this as our zeroth-order iteration. With this guess in hand, 
we can now start to iterate, finding 



d+\ = /te,5t,(6,d?i)) 



(4) 



as our ffist-order iteration. This will already be much closer to the final value, 
as long as the time steps are small enough and the function 6ti does not 
fluctuate too rapidly. In general, the fc*'' iteration will yield a value for C,i+i of 



4k) 



(fc-l)^ 



e):; = /te,5t.te,e^ro) 



(5) 



We will now consider the application of these techniques to block time steps. 
For the purpose of illustrating the use of block time steps, it will suffice to 
use the leapfrog scheme (also known as the Verlet-Stormer-Delambre scheme, 
according to the authors who rediscovered this scheme at roughly century-long 



intervals), which we present here in a self-starting, but still tinie-synimetric 
form: 



Vi+i = ri + ViSt + ai{5tY/2 
Vij,i = Vi + {ai + ai+i)6t/2 



(6) 



All our considerations carry over to higher-order schemes, as long as the base 
scheme can be made time-symmetric when iterated to c onvergence. An exam- 
ple of such a scheme is the widely used Hermite scheme f Makind . Il991l l. 



2.2 Flip-Flop Problem 



To start with, we apply the recipe of lHut. Makino fc McMillanI ( 19951 1 to block 
time steps. Let us define a block time step at level n as having a length: 



At. 



Ati 

On— 1 ■ 



(7) 



where Ati is the maximum time step length. Starting with the continuum 
choice of 



St, 



H^i) + h{^i, 



we now force each time step to take on the block value 5ti = At„ for the 
smallest n value that obeys the condition At„ < 6tc,i. In more formal terms, 
6ti = Sti{Stc,i) = Atn for the unique n value for which 



n = min < k 

k>l 



1 ^ /ife) + /ife+i) 



2k-i 



(9) 



The problem with this approach is that we are no longer guaranteed to find 
convergence for our iteration process, as can be seen from the following exam- 
ple. Let h{^i) = 0.502 and let the time derivative of h{C,i{t)) along the orbit 
be [d / dt)h{^i{t)) = —0.01. We then get the following results for our attempt 
at iteration. 



= /(6,<5t,(0.502)) = /(^,,0.5) 



(10) 



= /(e^, (5t,([0.502 + (0.502 + 0.5 * (-0.01))]/2)) 

= /te,(5t,([0.502 + 0.497]/2)) 

= /te,5t,(0.4995)) = /te,0.25) (11) 



d+\ = /te, 5t,(6, e^+i)) = /(6, '5t.([/i(6) + Kil,)]/2)) 
= /(^i, (5ti([0.502 + (0.502 + 0.25 * (-0.01))]/2)) 
= /(6,5t,([0.502 + 0.4995]/2)) 
= /(e.,(5t,(0.50075)) = /(e.,0.5) (12) 

And from here on, Q_^_\ = /(^j, 0.25) for every odd value of k and Q_^\ = 
f{C,i, 0.5) for every even value of k: the process of iteration will never converge. 

Under realistic conditions, for slowly varying h functions and small time steps, 
this flip-flop behavior will not occur often, but it will occur sometimes, for a 
non-negligible fraction of the time. We can see this already from the above 
example: for a linear decrease in the h function of {d/dt)h{C,i{t)) = —0.01, we 
will get flip-flopping not only for h{^i) = 0.502 but for any value in the flnite 
range 0.50125 < h{^i) < 0.5025. 

Since iteration converges correctly over the rest of the interval 0.5 < h{C,i) < 1, 
we conclude that in this particular case flip-flopping occurs about one quarter 
of one percent of the time, over this interval. This is far too frequent to be 
negligible in a realistic situation. 

Clearly, a straightforward extension of the implicit iterative time symmetriza- 
tion approach does not work for block time steps, because iteration does not 
converge. We have to add some feature, in some way. Our flrst attempt at a 
solution is to take the smallest of the two values in a flip-flop situation. 



2.3 Flip-Flop Resolution 



The most straightforward solution of the flip-flop dilemma is like cutting the 
Gordian knot: we just take the lowest value of the two alternate states. The 
drawback of this solution is that in general we need at least two iterations 
for each time step, to make sure that we have spotted, and then correctly 
treated, all flip-flop situation. In general, it is only at the third iteration that it 
becomes obvious that a flip-flop is occurring. To see this, consider the previous 
example with a starting value of h{^i) = 0.501. In that case we will get Q_^!l = 
/(^j,0.5) and Q_^:i = f{^i, 0.25), just as when we started with h{^i) = 0.502. 



(2) 

The difference shows up only at the second iteration, where we now find Q,^:^ = 
f{C,i, 0.25), a value that will hold for all higher iterations as well. 

The original iterative approach to time symmetry in practice already gives 
good results when we use only one iteration. This implies a penalty, in terms 
of force calculations per time steps, of a factor two compared to non-time- 
symmetric explicit integration. Now the use of fiip-fiop resolution will force us 
to always take at least two iterations per step, raising the penalty to become 
at least a factor of three. 

However, there is a more serious problem: there is still no guarantee that 
taking the lowest value in a fiip-fiop situation leads to a time-symmetric recipe. 
In fact, what is even more important, we have not yet checked whether our 
symmetric block time-step scheme is really time symmetric, in the absence of 
fiip-fiop complications. 

In order to investigate these questions, let us return to the example we used 
above, but instead of a linear time derivative, let us now use a quadratic time 
derivative for the h function that gives the estimate for the time step size. 
Rather than writing a formal definition, let us just state the values, while 
shifting the time scale so that t = coincides with the particle position being 

6: 



/i(0.00) = 0.502 
/i(0.25) = 0.499 
/i(0.50) = 0.499 (13) 

When we start at time t = 0, and we integrate forward, we find: 

= /(e.,(5t,(0.502)) = /te,0.5) (14) 



eS = /(6, sua. e°+i)) = /te, SUim,) + h{C+i)]m) 
= f{^i,6ti{[0.502 + 0A99]/2)) 
= /(6,5t,(0.5005)) = /(6,0.5) (15) 



and so on: all further kth iterations will result in Q_^:j^ = f{^i, 0.5). There is no 
fiip-fiop situation, when moving forward in time. 

However, when we now turn the clock backward, after taking this step of 
half a time unit, we start with the value /i(0.50) = 0.499, which leads to a 



first step back of 5t = 0.25. The end point of the first step back is t = 0.25 
with /i(0.25) = 0.499. Therefore, also here there is no fiip-fiop situation: all 
iterations, while going backward, result in a time step size of St = 0.25. 

We have thus constructed a counter example, where forward integration would 
proceed with time step 6t = 0.5 and subsequent backward integration would 
proceed with time step 6t = 0.25. Clearly, our scheme is not yet time sym- 
metric, even in the absence of a fiip-fiop case. 



2.4 A First Attempt at a Solution 



Let us rethink the whole procedure. The basic problem has been that the 
very first step in any of our algorithms proposed so far has not been time 
symmetric. The very first step moves forward, and leads to a newly evolved 
system at the end of the first step. Only after making such a trial integration, 
do we look back, and try to restore symmetry. However, as we have seen, the 
danger is large that this trial integration is not exhaustive: it may already go 
too far, or not far enough, and thereby it may simply overlook a type of move 
that the same algorithm would make if we would start out in the time-reversed 
direction. 

Formulating the problem in this way, immediately suggests a solution. At any 
point in time, let us first try to make the largest step that is allowed. If that 
step turns out to be too large for our algorithm, we try a step that is half that 
size, if that step is too large still, we again half the size, and so on, until we 
find a step size that agrees with our algorithm, when evaluated in both time 
directions. A similar treatment has been described by Quin et al (1997). 

This type of approach is clearly more symmetric than what we have attempted 
so far. Instead of using information of the physical system at the starting point 
of the next integration step, we only use a mathematical criterion to find the 
largest time step size allowed at that point, and we then apply the physical 
criteria symmetrically in both directions. 

Let us give an example. If the largest time step size is chosen to be unity, then 
at time t = we start by considering this time step. We try, in this order 
6t = 1, 6t = 0.5, 6t = 0.25, and so on, until we find a time step for which 
integration starting in the forward direction, and integration starting in the 
backward direction, both result in the new time step being acceptable. Let us 
say that this is the case for 6t = 0.125. 

After taking this step, we are at time t = 0.125. The largest time step allowed 
at that point, forward or backward, is 6t = 0.125. Any larger time step would 
result in non-alignment of the block time steps: in the backward direction it 



would jump over t = 0. So at this point we start by considering once more 
5t = 0.125. If that time step is too large, we try half that time step, halving 
it successively until we find a satisfactory time step size. 

Imagine that the second time step size is also 6t = 0.125. In that case, we land 
at t = 0.25. From there on, the maximum allowed time step size is 6t = 0.25, 
so the first try should be that size. 

In principle, this approach seems to be really time symmetric. However, there is 
a huge problem with this type of scheme, as we have just formulated it. Imagine 
the system to crawl along with time steps of, say St = 1/1024, and reaching 
time t = 1. Our new recipe then suggests to start by trying 6t = 1, a 1024-fold 
increase in time step! Whatever subtle physical effect it was that forced us to 
take such small time steps, is completely ignored by the mathematical recipe 
that forces us to look at such a ridiculously large time step. 

For example, in the case of stellar dynamics, a double star may force the stars 
that orbit each other to take time steps that are necessarily far shorter than 
the orbital period. Starting out with a trial step size that is far larger than 
an orbital period may or may not give spuriously safe-looking results. Clearly, 
we have to exclude such enormous jumps in time step. 



2.5 A Second Attempt at a Solution 



The simplest solution to taming sudden unphysical increases in time steps 
is to allow at most an increase of a factor two, in either the forward or the 
backward direction. This then implies that we can only allow decreases of a 
factor two, and not more than two, in either direction. The reason is that a 
decrease of a factor four in one direction in time would automatically translate 
into an increase of a factor four in the other direction. 

Note that we have to be careful with our time step criterion. If we allow time 
steps that are too large, we may encounter situations where our time step 
criterion would suggest us to shrink time steps by a factor of four, from one step 
to the other. Since our algorithm does not allow this, we can at most shrink by 
a factor of two, which may imply an unacceptably large step. However, if our 
time step criterion is sufficiently strict, allowing only reasonably small time 
steps too start with, it will be able to resolve the gradients in the criterion in 
such as way as to handle all changes gracefully through halving and doubling. 

When we apply this restriction to the scheme outlined in the previous subsec- 
tion, we arrive at the following compact algorithm. 

First a matter of notation. Any block time step, of size 6t = 1/2^, connects 



two points in time, only one of which can be written at t = Z /2'^^~^\ with Z 
an integer. Let us call that time value an even time, from the point of view of 
the given time step size, and let us call that other time value an odd time. To 
give an example, if 5t = 0.125, than t = 0, 0.25, 0.5, 0.75, 1 are all even times, 
while t = 0.125, 0.375, 0.625, 0.875 are all odd times. 



') 



Here is our algorithm: 

When we start in a given direction direction in time, at a given point in 
time, we should determine the time step size of the last step made by the 
system. In that way, we can determine whether the current time is even or 
odd, with respect to that last time step. 

If the current time is odd, our one and only choice is: to continue with the 
same size time step, or to halve the time step. First, we try to continue with 
the same time step. If, upon iteration, that time step qualifies according to 
the time-symmetry criterion used before, Eq. 9, we continue to use the same 
time step size as was used in the previous time step. If not, we use half of 
the previous time step. 

If the current time step is even, we have a choice between three options 
for the new time step size: doubling the previous time step size, keeping it 
the same, or halving it. We first try the largest value, given by doubling. 
If Eq.9 shows us that this larger time step is not too large, we accept it, 
otherwise we consider keeping the time step size the same. If Eq.9 shows 
us that keeping the time step size the same is okay, we accept that choice, 
otherwise we just halve the time step, in which case no further testing is 
needed. 

Note that in this scheme, we always start with the largest possible candidate 
value for the time step size. Subsequently, we may consider smaller values, 
but the direction of consideration is always from larger to smaller, never from 
smaller to larger. This guarantees that we do not run into the flip-flop problem 
mentioned above. 



3 Numerical Tests for the 2-Body Problem 



We present here the results for a gravitational two-body integration. The rel- 
ative orbit of the two point masses forms an ellipse with an eccentricity of 
e = 0.99. We have chosen a time unit such that the period of the orbit is 

T = 2tt. 

We have implemented four different integration schemes: 
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Fig. 1. Relative energy errors for a two-body integration of a bound orbit with 
eccentricity e = 0.99. The top hne with highest slope corresponds to algorithm 1, 
the line with intermediate slope corresponds to algorithm 2, and below those the 
two lines for algorithms and 3 are indistinguishable in this figure. 

0) the original time-symmetric integration scheme described bv lHut. Makino fc McMillanI 
(19951), where there is a continuous choice of time step size. This is the ap- 
proach described in section 2.1. We have used five iterations for each step. 



1) a block-time-step generalization, with a fixed number of iterations. This is 
the approach analyzed in section 2.2. Here, too, we chose five iterations for 
each step. 
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Fig. 2. Relative energy errors at apocenter. The four lines, from top to bottom, 
correspond to algorithms 1, 2, 3, and 0. 

2) a block time step generalization, with a variable number of iterations. If 
after five iterations, the fourth and the fifth iterations still give a different 
block time step size, then we choose the smallest of the two. This recipe 
avoids fiip-fiop situations. It is the approach described in section 2.3. 

The algorithm described in the next section, 2.4, we have not implemented 
here, because it is guaranteed to lead to large errors in those cases where a new 
large time step is allowed again just before pericenter passage. We therefore 
switched directly to the following section: 



11 



0.01 



AE 



5x10" 








Fig. 3. Same as Fig. 2, but for a duration that is ten times longer. 

3) the implementation of our favorite algorithm, where we start with a truly 
time symmetric choice of time step, with the restrictions that we only allow 
changes of a factor two in the direction of increasing and decreasing the time 
step, and that we only allow an increase of time step on the so-called even 
time boundaries. This is the approach given in section 2.5. 

In figures 1 and 2 we show the results of integrating our highly eccentric 
binary with these four integration schemes. In each case, the largest errors are 
produced by algorithm 1), smaller errors are produced by algorithm 2), and 
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even smaller errors appear with algorithm 3). Finally, algorithm 0) gives the 
smallest errors. 

Figure 1 shows the energy error in the two-body integration as a function 
of time. As is generally the case for time-symmetric integration, the errors 
that occur during one orbit are far larger than the systematic error that is 
generated during a full orbit. To bring this out more clearly, figure 2 shows 
the error only one time per orbit, at apocenter, the point in the orbit where 
the two particles are separated furthest from each other, and the error is the 
smallest. 

Finally, figure 3 shows the same data as figure 2, but for a period of time that 
is ten times longer. In both figures 2 and 3, it is clear that the first two block 
time steps algorithms, 1) and 2), both show a linear drift in energy. This is 
a clear sign of the fact that they violate time symmetry. Note that in both 
figures algorithm 3) gives rise to a time dependency that looks like a random 
walk. This may well be the best that can be done with block time steps, when 
we require time symmetry. 



4 N-body implementation 



So far, we have discussed the implementation of our block-symmetric algo- 
rithm for individual time steps in the case of the two-body problem. Of course, 
for A^ = 2, it is not really necessary to use block time steps, nor is it useful to 
introduce individual time steps. The reason we made both of these extensions 
was to implement and test our basic ideas in the simplest case. In this section, 
we describe and test our algorithm for the general A^-body problem. 



4-1 Divide and Conquer: the Concept of an Era 



The major conceptual difficulty in designing a time-symmetric block step 
scheme is the global context information that is needed, with extensions to- 
ward the future as well as the past. In order to determine the time step for 
particle i at time t, we need the information of all other particles j at that 
time. In general, there will be at least some j values for which the position 
and velocity of particle j are not given at time t, because particle j has a time 
step larger than particle i and time t happens to fall within the duration of 
one time step for particle j. 

In such a case, the time step of particle i at time t depends on the positions 
and velocities of other particles j, that can only be determined from time 
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symmetric interpolation between the positions and velocities of each particle j 
at times earlier and later than t. However, the future j positions and velocities 
depend in turn on the orbit of particle i, and thus on the time step of particle 
i at time t. In other words, there is a circular dependence between the future 
positions and velocities of particles j and the time step of particle i. 

To make things worse, each of the future positions and velocities of any of the 
particles in turn will depend on information that is given even further in the 
future. If we continue this logic, we would have to know the complete future 
of a whole simulation, before we could attempt to time symmetrize that whole 
history. And while any simulation will stop at a finite time, so the number of 
time steps for each particle will be a finite number, it is clearly unpractical 
to let the very first time step depends on the positions and velocities of the 
particles at the very end of simulation. 

A more practical solution is to impose a maximum size for any time step, as 
^tmax- If we start the simulation at time to, we know that all particles will 
reach time ti = to + ^imaxi by making one or more steps. At that time, all 
particles will be synchronized. This means that we can focus on time symmet- 
ric orbit integration for all particles during the interval [to,ti], without the 
need for any information about any particle at any time t > ti. 

In other words, we divide and conquer: we split the total history of our simu- 
lation into a number of smaller periods, which we call eras. Each era extends 
a period in time equal to the largest allowed time step Atmax, or to an integer 
multiple of At„iax, whatever turns out to be the most convenient. 



4-2 Era- Based Iteration 



Let the beginning of a single era be to and the end ti. As we saw above, in 
order to obtain the time step size for particle i at time t within our era, we 
typically need to know the positions and velocities of some other particles at 
times larger than t. The simplest way to provide this future information is 
through iteration. 

First we just perform standard forward integration, with the usual kind of 
non-time-symmetric block step algorithm, for the complete duration of our 
era (to < t < ti). While simultaneously integrating the orbits of all parti- 
cles, we store the positions and velocities (and if necessary higher order time 
derivatives) for all time steps for all particles during our era. This will then 
allow us to obtain the position and velocity of any particle at any arbitrary 
time through interpolation, to the accuracy given by this first try, which will 
function as our zeroth iteration. 
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Next we make our first iteration. We again perform orbit integration for our 
complete A^-body system, during to < t < ti. However, tliere are two differ- 
ences witli respect to tlie first try. First of all, in order to calculate the force 
from particle j on particle i, we no longer extrapolate the orbit of particle j to 
the time requested by i, but instead we interpolate the position and velocity of 
particle j to the requested time, using stored positions and velocities of parti- 
cle j at slightly earlier and later times, using a time symmetric interpolation 
scheme. Secondly, we can now begin to symmetrize the time step for particle i, 
in the same way as we did it for the two-body problem in the previous section, 
with one exception: we now obtain the estimated time step size at the begin- 
ning of the time step from the current iteration, and we obtain the estimated 
time step size at the end of the time step from the previous iteration (in the 
two-body case the iteration was done separately for each step). 

Subsequent attempts, as second and higher iterations, repeat the same steps 
as the first iteration. 

As before, we have implemented our iteratively time symmetric block step 
algorithm using the leapfrog algorithm as our basic integrator. Generalizations 
to higher-order schemes are somewhat more complex, but follow the same basic 
logic we are outlining in the current paper. We have adopted the following time 
step criterion for particle i: 

\r ' '\ 
6ti = rj min,-^, (16) 

where 77 is a constant parameter and r^j and Vij are the relative position and 
velocity between particles i and j. To symmetrize the time step, we simply 
require that the step sizes that would be determined by the above criterion at 
both ends of the time step are not smaller than the actual time step used. In 
other words, we take the minimum of the two time step values that our crite- 
rion gives us at the beginning and at the end of the time step; this minimum 
is our symmetrized time step. We could have taken the average, but here we 
have used the minimum value, for simplicity. 



4.3 Numerical Results for N = 100 and N = 512 



Figure 4 show how the energy errors grow in the 100-body problem. In each 
case, we started with random realizations of a Plummer model, where we used 
standard A^-body units, in which the gravitational constant G = 1, the total 
mass M = 1 and the total energy is Etot = —1/4. We have integrated each 
system for 50 time units, with a maximum time step of 1/64 and 77 = 0.1 (see 
Eq. 16). We have used standard Plummer type softening with softening length 
e = 0.01. We have carried out forty time integrations, starting from twenty 
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Fig. 4. Growth of the relative energy error for 100-body runs, starting from twenty 
different sets of initial conditions. For each set of initial conditions, two integrations 
have been performed, one without and one with time-symmetrization (in the latter 
case, using six iterations). The twenty lines with time symmetrization form the 
horizontal bundle which is slowly spreading in square-root-of-time fashion like a 
random walk; the twenty lines without time symmetrization all show a systematic, 
near-linear decrease in energy. 

different realizations of the Plummer model. For each realization, we have 
integrated the system once without any time symmetrization and once with 
time symmetrization using six iterations to guarantee sufficient convergence. 
In our experience, at least three iterations were necessary to achieve high 
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accuracy. 

It is clear that all runs without time symmetry show a systematic drift in 
energy, while no such systematic tendency is visible for time symmetrized 
runs. Among the twenty non-symmetrized runs, even the best result came 
out worse than the worst result among the symmetrized runs. We conclude 
that time symmetrization can significantly improve the long-term accuracy of 
A^-body simulations. 

Figure 5 shows similar results for 512-body runs. Here we have started from 
a single Plummer model realization, and the curves show the effect of varying 
the number of iterations. In this case, the second and third iteration already 
show a dramatic improvement in the long-time behavior of the total energy 
of the system. The softening used here is e = 1/512. All other parameters are 
the same as for the 100-body runs. For the 512-body case, the improvement is 
significantly better than it was in the 100-body runs. Finally, figure 6 shows 
that the results depicted in figure 5 are generic: starting from a different set 
of initial conditions changes the details but not the overall picture, and again 
the second and third iterations show dramatic improvements over the original 
run and the first iteration. 

We offer the following explanation for the increase in accuracy of the time- 
symmetric scheme as a function of particle number. In these runs, contri- 
butions to the error are largely generated by close encounters between two 
particles, since the softening we used is relatively small. These error contri- 
butions are dominated by weak encounters, since the softening, while small, 
is not small enough to make gravitational focusing significant. The number 
of weak encounters that take place during one time unit, within a particle- 
particle distance that is less than the inter-particle distance (of order N^^^) 
is of order 0{N^^^). If our time symmetrization succeeds in replacing the sys- 
tematic effects of all these encounters by a random effect, the result will be a 
shift from a linear to a square root drift, effectively replacing the 0{N'^^^) de- 
pendence by a 0{N'^^^) dependence. We conclude that the relative reduction 
in the total energy error, due to time symmetrization, grows with A^, as N"^^^. 

Whether or not time-symmetric integration is to be preferred, depends on 
the number of particles and the duration of the integration. In the example 
case shown in figure 4, the energy error is about a factor of 10 smaller for 
time-symmetric integration, but the same effect could be achieved in a com- 
putationally cheaper way by reducing the step size by a factor of ^/To. For 
large A^, however, as shown already in figure 5, the effect of time symmetry is 
more pronounced. 

For very long integrations, time-symmetric integration is clearly better, since 
in that case the error grows in random-walk fashion, as y/t, while for any non- 
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Fig. 5. Growth of the relative energy error for 512-body runs, starting from a single 
set of initial conditions, but using a different number of iterations. The lowest curve 
presents an integration without time symmetrization. The curve above that presents 
the result of time symmetrization using only one iteration. The next two curves 
show the results of using three and two iterations, respectively; initially, the third 
iteration curve rises a bit above the second iteration curve. 

time-symmetric scheme scheme the error grows linearly, proportional to t. In 
the case of a star-cluster simulation which covers many relaxation timescales, 
the particles in the core can go through a very large number of crossing times. 
For example, a simulation of a globular cluster with 10^ stars would need to 
cover at least 10^ half-mass crossing times. The crossing timescale in the core 
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Fig. 6. Growth of the relative energy error for 512-body runs, hke figure 5, but 
starting from a different set of initial conditions. As before, the lowest curve presents 
an integration without time symmetrization, and the curve above that presents the 
result of time symmetrization using only one iteration. The second iteration curve 
is the one that stays above the third iteration curve for most of the run depicted 
here. 

is at least a factor of 100 shorter than that of the half-mass crossing time. 
Therefore, particles in the core need to be followed for as many as 10^ local 
crossing times. The difference between the scaling of \/t and t produces a 
improvement of a factor of roughly 10^'^ in the energy error, in the case of 
time-symmetric integration. With a second-order scheme, this translates into 
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a factor of 100 difference in the necessary step size. Even with a fourth-order 
scheme, it would imply a difference of a factor of ten in the time step. Clearly, 
even applying five iterations will produce a gain of a factor two with respect 
to the alternative of having to decrease the time step by a factor of ten. 



4.4 Description of the Algorithm 



Here we describe the algorithm in some more detail. To enable the reader 
to check the precise implementation of our algorithm, we have made avail- 
able the computer code used to generate figures 4, 5, and 6, on our Art of 
Computational Science web site ^ 

Consider the integration of the system from time to to ti. We can assume for 
simplicity that the zero point in time has been chosen in such a way as to be 
compatible with the era size, so that both to and ti are integer multiples of 
the time interval ti — to. 

In the first pass through this era, we integrate all particles with the standard 
block step scheme, without any intention to make the scheme time symmetric. 
In the calculations reported in this paper, we have used the same predictor- 
corrector form of the leapfrog algorithm as mentioned earlier in Eq. 6: 



1 A 2 

r„e«; = rold + Vold^t + -aoid^t , 

1 

Vne«; = Void + -{'^old + a„e«,)At. (17) 



As we discussed before, if one particle i wants to step forward in time, we need 
to know the positions of all particles j in order to compute the force that each 
particle j exerts on particle i. Among the particles j, many may have a larger 
time step than particle i, so we may have to predict the position for such a 
particle, for the time at which particle i wants to make a step. In addition, 
we need to predict the velocity of particle j, because the velocity difference 
between particles i and j are used in determining the time step size, according 
to Eq. 16. 

The predicted position rp^ j for particle j is obtained with a second-order Taylor 
expansion, while for the predicted velocity Vp^ j a first-order expansion suffices: 



http://www.ArtCompSci.0rg/kali/vol/block_symmetric/ccode/nbody.c 
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rp, j = rj + Vj{t - tj) + -aj{t - tjf, 



The integrated positions and velocities for each particle i at each time step, 
Vnew and Vnew in Eq- 17, are all stored. 

In the next iteration, we proceed in the same way, but the time step is calcu- 
lated differently. At time t for particle i, let the time step calculated according 
to the criterion (16) be 6t, and the time step not exceeding this St and com- 
patible with the block step criterion be Atp. But now we would like to know 
the time step size that would be required at the end of this step, at time 
t' = t + Atp. 

There are two possibilities. If particle i ended a time step at time t' in the 
previous iteration, we are in luck, and we can use the same procedure we used 
in the two-body case. To be specific, we test if the time step according to 
criterion (16) is smaller than Atp] if so, we halve the value of Atp. However, 
if we are not in luck, and we do not have time t' in the list of times for which 
particle i was integrated in the previous iteration, we simply adopt the time 
step that we obtained looking in the forward direction. In that case, we have 
to wait till a next iteration, to apply the time symmetrization procedure. This 
occasional miss explains why the iteration procedure often requires several 
iterations before accurate time symmetry can be obtained. 

To calculate the force at the new time for particle i, we use the positions of the 
other particles j calculated by interpolation, based on the previous iteration, 
in the following way. The interpolation itself is done in a straightforward, 
linear way. However, since we have a more accurate position at hand for at 
least the starting point of each orbit segment, for particle j, we may as well 
correct the old orbit segment by shifting it rigidly by an amount equal to the 
difference between the starting point of the current and the previous iteration. 

To be specific, the interpolated position at time t for particle j is given by 

rp = (1 - f)rs + fr, + Ar„ (19) 

where / = {t — ts) / (te — tg) , tg is the largest time in the list of times for particle 
j not exceeding t and tg is the next time in that list, immediately following 
tg- Here both r^ and r^ are obtained from the stored results from the previous 
iteration. The correction term Avg is defined as 

where rs,new is the position of particle j at time tg in the current iteration. 
As in the case described above, sometimes we are unlucky, and Vg^new is not 
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available. In that case we just set Ar^ to be zero, postponing further accuracy 
improvement until the next iteration. 

To make our predictor-corrector form of the leapfrog integration scheme con- 
sistent, we use the same interpolation scheme for the predictor part, for the 
particles to be integrated. For the corrector part, we used the trapezoidal 
scheme, as follows: 



Vc,new — Void + -^tio-old + Q-new), (21) 

renew = Told + -At{Vold + Vc,new)- (22) 



Here, the subscript old refers to the value at the previous time, the subscript 
c,new refers to the corrected value at the new time, aoid is calculated with the 
old values for the positions, and anew is calculated with the predicted values 
of the positions. Note that the first corrected quantity that can be computed 
is Vc^new, bascd on the old and predicted quantities. After that, we can also 
compute the corrected quantity r^newi based on the old quantities and Vcnew- 



5 Discussion 



We have succeeded in constructing an algorithm for time symmetrizing block 
time steps that does not show a linear growth of energy errors. As far as we 
know, this is the first such algorithm that has been discovered. We expect 
this algorithm to have practical value for a wide range of large-scale paral- 
lel N-body simulations. While we have illustrated our approach for simplicity 
with the leapfrog scheme, all our considerations carry over to higher-order 
schemes, as long as the base scheme can be made time-symmetric when iter- 
ated to convergence. A n example of such a scheme is the widely used Hermite 
scheme ( Makinol . 1 1 9 9 1 ) . We plan to discuss such applications in a future paper. 



A major novelty in our scheme has been the introduction of a time period, 
which we can an era, during which all positions and velocities of all particles 
are stored in memory. These values are retained from one iteration to the 
next. We expect that this procedure will have other advantages as well, in 
that it prevents sudden surprises to occur. For example, if a particle will 
suddenly require a very high speed, it may approach another particle with a 
long time step without any warning. As another example, a star may undergo 
a supernova explosion, something that other particles will normally only notice 
when there current time step has finished. In both cases, after the first iteration 
all particles will have access to full knowledge about these unexpected events. 
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and during the iteration procedure, they can automatically adapt to the new 
situation. 



Our new scheme promises to be competitive with traditional non-symmetrized 
schemes, especially for very long integration times, in that the same error 
bounds may be reached using less computer time. To prove that this will be 
the case for realistic applications clearly requires further detailed investiga- 
tions, beyond the scope of the current paper. The memory use of our scheme 
may seem formidable, and indeed, when a large value for the era size is cho- 
sen, memory use is increased significantly over traditional schemes. For those 
N-body calculations that are CPU time limited, this may not be much of a 
concern. However, for large-scale cosmological simulations and other applica- 
tions for which memory is important, it is possible to choose an era size in 
such a way that the memory requirement of the new scheme is less than twice 
the memory requirement of non-symmetric schemes, at a CPU performance 
penalty of less than a factor two. Here the trick is to take an era size close 
to the harmonic mean of the time steps of all particles. In that way, half the 
computing cost of the non-symmetric scheme is associated with particles that 
have time steps shorter than this era size. Those particles with natural time 
steps longer than this era choice will see their step size shortened to the era 
size, but the total increase in time steps will be less than a factor two. 



The reason that our scheme shows a dramatic improvement in accuracy is time 
symmetry, which suppresses linear error growth. The reason that our scheme 
is somewhat complicated is purely empirical: all else failed, in our attempts 
to try simpler schemes. Whether our procedure is the simplest scheme that 
actually can produce time symmetric versions of block time step codes is an 
open question. It may well be, but we certainly have no mathematical proof. 
This is an interesting question to be pursued further, for theoretical as well as 
practical reasons. 
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